AITopics | average return

Similarly to [6], we consider that all environments have the same underlying Structural Causal Model (SCM) and that the different environments correspond to different interventions on the SCM. We provide here the formal definition for SCMs and interventions. We say that $X_i$ causes $X_j$ if $X_i \in \mathrm{Pa}(X_j)$. Definition A.2 (Intervention) [6]: Consider an SCM $\mathcal{C} = (S, N)$. An intervention $e$ on $\mathcal{C}$ consists of replacing one or several of its structural equations to obtain an intervened SCM $\mathcal{C}^e = (S^e, N^e)$ with structural equations $S^e_j : X^e_j \leftarrow f_j(\mathrm{Pa}(X^e_j), N^e_j)$, for $j = 1, \dots, m$ (11). The variable $X^e_i$ is intervened on if $S_i \neq S^e_i$ or $N_i \neq N^e_i$.
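The definition of an intervention quoted above can be illustrated with a small sketch. It assumes a toy two-variable SCM (X1 causes X2) with Gaussian noise; the names `sample_scm`, `intervene`, and `intervened_on`, and the specific structural equations, are hypothetical and are not taken from the paper.

```python
import random

def sample_scm(equations, noise, rng):
    """Draw one sample; equations are evaluated in insertion (causal) order."""
    values = {}
    for name, f in equations.items():
        values[name] = f(values, noise[name](rng))
    return values

# Hypothetical base SCM C = (S, N): X1 <- N1, X2 <- 2 * X1 + N2.
base_eq = {
    "X1": lambda v, n: n,
    "X2": lambda v, n: 2.0 * v["X1"] + n,
}
base_noise = {
    "X1": lambda rng: rng.gauss(0.0, 1.0),
    "X2": lambda rng: rng.gauss(0.0, 1.0),
}

def intervene(equations, noise, new_equations=None, new_noise=None):
    """Build the intervened SCM (S^e, N^e) by replacing some equations or noises."""
    eq_e = dict(equations)
    eq_e.update(new_equations or {})
    noise_e = dict(noise)
    noise_e.update(new_noise or {})
    return eq_e, noise_e

def intervened_on(equations, noise, eq_e, noise_e):
    """Variables whose structural equation or noise differs from the base SCM."""
    return {
        j for j in equations
        if equations[j] is not eq_e[j] or noise[j] is not noise_e[j]
    }

# An environment e that clamps X2 to zero (a hard intervention on X2 only).
eq_e, noise_e = intervene(base_eq, base_noise,
                          new_equations={"X2": lambda v, n: 0.0})
sample_e = sample_scm(eq_e, noise_e, random.Random(0))
changed = intervened_on(base_eq, base_noise, eq_e, noise_e)
```

Here `changed` contains only `"X2"`, matching the condition that a variable is intervened on when its structural equation or its noise differs between the base and the intervened SCM.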

artificial intelligence, different environment, machine learning, (17 more...)

Neural Information Processing Systems

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.91)

a03caec56cd82478bf197475b48c05f9-Supplemental.pdf

Neural Information Processing Systems | Feb-10-2026, 08:01:00 GMT

artificial intelligence, machine learning, trajectory, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.98)

7967cc8e3ab559e68cc944c44b1cf3e8-Supplemental.pdf

Neural Information Processing Systems | Feb-9-2026, 01:14:34 GMT

agent, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.48)

MetaBox: A Benchmark Platform for Meta-Black-Box Optimization with Reinforcement Learning

Neural Information Processing Systems | Feb-8-2026, 22:47:47 GMT

MetaBox offers a flexible algorithmic template that allows users to effortlessly implement their unique designs within the platform.

evolutionary algorithm, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.04)
Asia > China (0.04)
North America > United States > Florida > Alachua County > Gainesville (0.04)
(3 more...)

Industry: Transportation > Air (0.42)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

204904e461002b28511d5880e1c36a0f-Supplemental.pdf

Neural Information Processing Systems | Feb-7-2026, 19:24:14 GMT

different environment, noise variable, test environment, (15 more...)

Neural Information Processing Systems

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

03255088ed63354a54e0e5ed957e9008-Supplemental.pdf

Neural Information Processing Systems | Feb-7-2026, 07:55:51 GMT

algorithm, experiment, mage, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Rank-1 Approximation of Inverse Fisher for Natural Policy Gradients in Deep Reinforcement Learning

Huo, Yingxiao, Dash, Satya Prakash, Stoican, Radu, Kaski, Samuel, Sun, Mingfei

arXiv.org Machine Learning | Jan-27-2026

Natural gradients have long been studied in deep reinforcement learning due to their fast convergence properties and covariant weight updates. However, computing natural gradients requires inverting the Fisher Information Matrix (FIM) at each iteration, which is computationally prohibitive. In this paper, we present an efficient and scalable natural policy optimization technique that leverages a rank-1 approximation to the full inverse FIM. We theoretically show that, under certain conditions, a rank-1 approximation to the inverse FIM converges faster than policy gradients and, under some conditions, enjoys the same sample complexity as stochastic policy gradient methods. We benchmark our method on a diverse set of environments and show that it achieves superior performance to standard actor-critic and trust-region baselines.
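To see why rank-1 structure can make the inverse cheap, the sketch below applies the Sherman-Morrison identity to a damped rank-1 matrix of the form damping * I + s s^T. This is a generic illustration, not the algorithm from the paper: the damped rank-1 form, the function name `rank1_natural_gradient`, and the vectors used are all assumptions made for the example.

```python
def rank1_natural_gradient(grad, score, damping):
    """Solve (damping * I + s s^T) x = g in O(n) via Sherman-Morrison.

    grad    : list of floats, the ordinary gradient g
    score   : list of floats, the vector s defining the rank-1 term
    damping : positive float added to the diagonal (assumed for invertibility)
    """
    s_dot_g = sum(s * g for s, g in zip(score, grad))
    s_dot_s = sum(s * s for s in score)
    # (damping*I + s s^T)^{-1} = (1/damping) * (I - s s^T / (damping + s^T s))
    coeff = s_dot_g / (damping + s_dot_s)
    return [(g - coeff * s) / damping for g, s in zip(grad, score)]

# Hypothetical two-parameter example.
score = [1.0, 2.0]
grad = [3.0, 4.0]
damping = 0.5
step = rank1_natural_gradient(grad, score, damping)

# Check the result by multiplying back: (damping*I + s s^T) @ step should equal grad.
s_dot_step = sum(s * x for s, x in zip(score, step))
reconstructed = [damping * x + s * s_dot_step for x, s in zip(step, score)]
```

The solve uses only dot products and vector scaling, so no n-by-n matrix is ever formed or inverted.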

artificial intelligence, machine learning research, reinforcement learning, (18 more...)

arXiv.org Machine Learning

2601.18626

Country:

North America > Canada > British Columbia (0.28)
North America > United States > California (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Beyond Average Return in Markov Decision Processes

Neural Information Processing Systems | Dec-26-2025, 14:02:03 GMT

What are the functionals of the reward that can be computed and optimized exactly in Markov Decision Processes? In the finite-horizon, undiscounted setting, Dynamic Programming (DP) can only handle these operations efficiently for certain classes of statistics. We summarize the characterization of these classes for policy evaluation, and give a new answer for the planning problem. Interestingly, we prove that only generalized means can be optimized exactly, even in the more general framework of Distributional Reinforcement Learning (DistRL). DistRL does, however, allow other functionals to be evaluated approximately. We provide error bounds on the resulting estimators, and discuss the potential of this approach as well as its limitations. These results contribute to advancing the theory of Markov Decision Processes by examining overall characteristics of the return, and particularly risk-conscious strategies.
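As a rough illustration of what a generalized mean of the return looks like, the sketch below uses the standard quasi-arithmetic form f^{-1}(E[f(R)]). That this is the exact family the paper has in mind is an assumption, and the return distribution, the function name `generalized_mean`, and the parameter `lam` are hypothetical.

```python
import math

def generalized_mean(returns, probs, f, f_inv):
    """Quasi-arithmetic mean f_inv(E[f(R)]) of a discrete return distribution."""
    return f_inv(sum(p * f(r) for r, p in zip(returns, probs)))

# Hypothetical return distribution over three outcomes.
returns = [0.0, 1.0, 2.0]
probs = [0.25, 0.5, 0.25]

# f = identity recovers the ordinary expected (average) return.
mean_return = generalized_mean(returns, probs, lambda x: x, lambda y: y)

# f(x) = exp(lam * x) gives an exponential-utility (entropic) criterion;
# lam < 0 penalizes variability, so the value falls below the mean.
lam = -1.0
entropic = generalized_mean(
    returns, probs,
    lambda x: math.exp(lam * x),
    lambda y: math.log(y) / lam,
)
```

With the identity function the criterion reduces to the average return, while a negative `lam` yields a risk-averse value that is lower than the mean for any non-degenerate distribution.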

average return, markov decision process, name change, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

fe90657b12193c7b52a3418bdc351807-Paper-Conference.pdf

232eee8ef411a0a316efa298d7be3c2b-Paper-Datasets_and_Benchmarks.pdf
